Principled Tools for Modeling and Visualizing 2D Vector Fields
2. ggvfields
geom_complex_function()
geom_stream_field(se = TRUE)
ggfunction
- Visual Improvements
- CRAN Publication
ggvfields can be understood by analogy with core ggplot2 concepts
Real-valued functions of real arguments \(f: \mathbb{R}\to \mathbb{R}\) are familiar from calculus
Their graphs are fundamental to their understanding (eg. \(f(x) = \sin(x)\))
ggvfields can be understood by analogy with core ggplot2 concepts
Real-valued functions of real arguments \(f: \mathbb{R}\to \mathbb{R}\) are familiar from calculus
Their graphs are fundamental to their understanding (eg. \(f(x) = \sin(x)\))
ggvfields can be understood by analogy with core ggplot2 concepts
Real-valued functions of real arguments \(f: \mathbb{R}\to \mathbb{R}\) are familiar from calculus
Their graphs are fundamental to their understanding (eg. \(f(x) = \sin(x)\))
ggvfields can be understood by analogy with core ggplot2 concepts
Vector fields are vector-valued functions of vector arguments \(f: \mathbb{R}^{n} \to \mathbb{R}^{m}\)
In this talk we’ll assume \(n = m = 2\), a common case in applications
Thought of in various ways
\(\displaystyle{\textbf{f}(x,y) = \textbf{f}\left(\left[\begin{matrix}x \\ y \end{matrix}\right]\right) = \left[\begin{matrix}f_{1}(x,y) \\ f_{2}(x,y) \end{matrix}\right]}\) “component functions”
Examples:
\(\displaystyle{\textbf{f}(x,y) = \textbf{f}\left(\left[\begin{matrix}x \\ y \end{matrix}\right]\right) = \left[\begin{matrix}-y \\ x \end{matrix}\right]}\)
\(\displaystyle{\nabla \ell(\mu,\sigma^{2}) = \left[\begin{matrix}\frac{\partial}{\partial\mu} \ell(\mu,\sigma^{2}) \\ \frac{\partial}{\partial\sigma^{2}} \ell(\mu,\sigma^{2}) \end{matrix}\right]}\)
f <- function(u) {
x <- u[1]; y <- u[2]
c(-y, x)
}
N <- 11
df <- expand.grid( "x" = seq(-1, 1, len = N), "y" = seq(-1, 1, len = N) )
df$fy <- df$fx <- NA
for (i in 1:nrow(df)) df[i,3:4] <- f( as.numeric(df[i,1:2]) )
df <- transform(df, "radius" = sqrt(fx^2 + fy^2), "angle" = atan2(fy, fx))
head(df)
# x y fx fy radius angle
# 1 -1.0 -1 1 -1.0 1.414214 -0.7853982
# 2 -0.8 -1 1 -0.8 1.280625 -0.6747409
# 3 -0.6 -1 1 -0.6 1.166190 -0.5404195
# 4 -0.4 -1 1 -0.4 1.077033 -0.3805064
# 5 -0.2 -1 1 -0.2 1.019804 -0.1973956
# 6 0.0 -1 1 0.0 1.000000 0.0000000ggfields
Mathematica 11.3
Mathematica 13.1
Challenges with results
Unattractive results Could be more visually appealing
Not informative Lacks clarity or detail
Challenges with code
Verbose syntax Very clunky and wordy
High user burden Lots of effort and mental load
Difficult to customize Hard to see now; examples to come
Not extensible Faceting? Superimposing different colors? etc.
Ridged Accept data one way
The results should be…
Correct Does what it claims to do
Clear Easily understandable
Attractive Visually appealing
The code should be…
Simple Straightforward and concise
Familiar Easy for a ggplot user
Flexible Accept functions or data of any form
| Item | PPP | Defense |
|---|---|---|
| Gas Price (per gallon) | $3.25 | $3.00 |
| Milk Price (per gallon) | $4.03 | $4.02 |
| Egg Price (per dozen) | $3.17 | $4.90 |
| Item | PPP | Defense |
|---|---|---|
| Gas Price (per gallon) | $3.25 | $3.00 |
| Milk Price (per gallon) | $4.03 | $4.02 |
| Egg Price (per dozen) | $3.17 | $4.90 |
| Dallas Cowboys Super Bowls Since 1996 | 0 | 0 |
| Taylor Swift’s Boyfriend | Travis Kelce | Travis Kelce |
| Item | PPP | Defense |
|---|---|---|
| Gas Price (per gallon) | $3.25 | $3.00 |
| Milk Price (per gallon) | $4.03 | $4.02 |
| Egg Price (per dozen) | $3.17 | $4.90 |
| Dallas Cowboys Super Bowls Since 1996 | 0 | 0 |
| Taylor Swift’s Boyfriend | Travis Kelce | Travis Kelce |
| ggvfields |
Install from CRAN
Install from Github
Load the package
Inputs
Users will come with vector data
- x, y, fx, fy
- x, y, xend, yend
- x, y, angle, direction
- x, y, angle_deg, direction
# x y fx fy radius angle xend yend
# 1 -1.0 -1 1 -1.0 1.414214 -0.7853982 0.0 -2.0
# 2 -0.8 -1 1 -0.8 1.280625 -0.6747409 0.2 -1.8
# 3 -0.6 -1 1 -0.6 1.166190 -0.5404195 0.4 -1.6
# 4 -0.4 -1 1 -0.4 1.077033 -0.3805064 0.6 -1.4
# 5 -0.2 -1 1 -0.2 1.019804 -0.1973956 0.8 -1.2
Stream data
- x, y, t
# x y t
# 1 -1.0 -1 1
# 2 -0.8 -1 2
# 3 -0.6 -1 3
# 4 -0.4 -1 4
# 5 -0.2 -1 5
And functions
- function(u) c(-u[2], u[1])
- function(u) c(sin(u[2] + u[1]), cos(u[2] - u[1]))
Visualization Types
Users will want both vector fields and stream fields.
Consider the vector field
\[ \textbf{f}(x,y) = (-y,\,x) \]
which yields the ODE system
\[ \frac{dx}{dt} = -y,\quad \frac{dy}{dt} = x \]
Euler’s method estimates the ODE solution by
\[ \mathbf{v}_{n+1} = \mathbf{v}_n + \Delta t\,\textbf{f}(\mathbf{v}_n) \]
Starting at \(\mathbf{v}_{0} = (1,1)\) and using \(\Delta t = 0.1\):
Iteration 1:
\(\textbf{f}(1,1) = (-1,\,1)\), so
\(\textbf{v}_{1} = (1,1) + 0.1\,(-1,1) = (0.9,\,1.1)\).
Iteration 2:
\(\textbf{f}(0.9,1.1) = (-1.1,\,0.9)\), so
\(\textbf{v}_2 = (0.9,1.1) + 0.1\,(-1.1,0.9) = (0.9-0.11,\,1.1+0.09) = (0.79,\,1.19)\).
# Integrate the system using ode
(stream <- ode(init, times, f_ode, parms = NULL, method = "euler"))
# time x y
# 1 0.0 1.00000000 1.000000
# 2 0.1 0.90000000 1.100000
# 3 0.2 0.79000000 1.190000
# 4 0.3 0.67100000 1.269000
# 5 0.4 0.54410000 1.336100
# 6 0.5 0.41049000 1.390510
# 7 0.6 0.27143900 1.431559
# 8 0.7 0.12828310 1.458703
# 9 0.8 -0.01758719 1.471531
# 10 0.9 -0.16474031 1.469772
# 11 1.0 -0.31171756 1.453298Reminder: A vector is a special case of a stream
init <- c(x = 1, y = 1); times <- seq(0, 1, by = 1)
(vector <-
data.frame(ode(init, times, f_ode, parms = NULL, method = "euler")) |>
print() |>
pivot_wider(names_from = time, values_from = c(x, y)) |>
mutate(fx = x_1 - x_0,
fy = y_1 - y_0,
angle = atan2(fy, fx),
angle_deg = angle * 180 / pi,
distance = sqrt(fx^2 + fy^2)
) |>
rename_with(~ str_replace_all(.x, c("_0" = "_start", "_1" = "_end")))
)
# time x y
# 1 0 1 1
# 2 1 0 2
# # A tibble: 1 × 9
# x_start x_end y_start y_end fx fy angle angle_deg distance
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 1 0 1 2 -1 1 2.36 135 1.41p3 <- p + geom_vector(aes(xend = x_end, yend = y_end), normalize = FALSE, center = FALSE)
p4 <- p + geom_vector(aes(fx = fx, fy = fy), normalize = FALSE, center = FALSE)
p5 <- p + geom_vector(aes(angle = angle, distance = distance), normalize = FALSE, center = FALSE)
p6 <- p + geom_vector(aes(angle_deg = angle_deg, distance = distance), normalize = FALSE, center = FALSE)
(p3 + p4) / (p5 + p6) + plot_layout(guides = "collect")generate_path <- function(u, step) {
init <- unname(c(x = u[1], y = u[2])); times <- seq(0, 1, by = step)
df <- data.frame(ode(init, times, f_ode, parms = NULL, method = "euler"))
colnames(df) <- c("time", "x", "y")
df
}
streams <-
expand.grid(x = seq(-1,1,.5), y = seq(-1,1,.5)) |>
apply(1, generate_path, step = .1) |> bind_rows(.id = "id")
streams |> group_by(id) |> slice_head(n = 4) |> group_split() |> map(x = c(1,2), ~pluck(.x))
# [[1]]
# # A tibble: 4 × 4
# id time x y
# <chr> <dbl> <dbl> <dbl>
# 1 1 0 -1 -1
# 2 1 0.1 -0.9 -1.1
# 3 1 0.2 -0.79 -1.19
# 4 1 0.3 -0.671 -1.27
#
# [[2]]
# # A tibble: 4 × 4
# id time x y
# <chr> <dbl> <dbl> <dbl>
# 1 10 0 1 -0.5
# 2 10 0.1 1.05 -0.4
# 3 10 0.2 1.09 -0.295
# 4 10 0.3 1.12 -0.186
#
# [[3]]
# # A tibble: 4 × 4
# id time x y
# <chr> <dbl> <dbl> <dbl>
# 1 11 0 -1 0
# 2 11 0.1 -1 -0.1
# 3 11 0.2 -0.99 -0.2
# 4 11 0.3 -0.97 -0.299
#
# [[4]]
# # A tibble: 4 × 4
# id time x y
# <chr> <dbl> <dbl> <dbl>
# 1 12 0 -0.5 0
# 2 12 0.1 -0.5 -0.05
# 3 12 0.2 -0.495 -0.1
# 4 12 0.3 -0.485 -0.150
#
# [[5]]
# # A tibble: 4 × 4
# id time x y
# <chr> <dbl> <dbl> <dbl>
# 1 13 0 0 0
# 2 13 0.1 0 0
# 3 13 0.2 0 0
# 4 13 0.3 0 0
#
# [[6]]
# # A tibble: 4 × 4
# id time x y
# <chr> <dbl> <dbl> <dbl>
# 1 14 0 0.5 0
# 2 14 0.1 0.5 0.05
# 3 14 0.2 0.495 0.1
# 4 14 0.3 0.485 0.150
#
# [[7]]
# # A tibble: 4 × 4
# id time x y
# <chr> <dbl> <dbl> <dbl>
# 1 15 0 1 0
# 2 15 0.1 1 0.1
# 3 15 0.2 0.99 0.2
# 4 15 0.3 0.97 0.299
#
# [[8]]
# # A tibble: 4 × 4
# id time x y
# <chr> <dbl> <dbl> <dbl>
# 1 16 0 -1 0.5
# 2 16 0.1 -1.05 0.4
# 3 16 0.2 -1.09 0.295
# 4 16 0.3 -1.12 0.186
#
# [[9]]
# # A tibble: 4 × 4
# id time x y
# <chr> <dbl> <dbl> <dbl>
# 1 17 0 -0.5 0.5
# 2 17 0.1 -0.55 0.45
# 3 17 0.2 -0.595 0.395
# 4 17 0.3 -0.635 0.336
#
# [[10]]
# # A tibble: 4 × 4
# id time x y
# <chr> <dbl> <dbl> <dbl>
# 1 18 0 0 0.5
# 2 18 0.1 -0.05 0.5
# 3 18 0.2 -0.1 0.495
# 4 18 0.3 -0.150 0.485
#
# [[11]]
# # A tibble: 4 × 4
# id time x y
# <chr> <dbl> <dbl> <dbl>
# 1 19 0 0.5 0.5
# 2 19 0.1 0.45 0.55
# 3 19 0.2 0.395 0.595
# 4 19 0.3 0.336 0.635
#
# [[12]]
# # A tibble: 4 × 4
# id time x y
# <chr> <dbl> <dbl> <dbl>
# 1 2 0 -0.5 -1
# 2 2 0.1 -0.4 -1.05
# 3 2 0.2 -0.295 -1.09
# 4 2 0.3 -0.186 -1.12
#
# [[13]]
# # A tibble: 4 × 4
# id time x y
# <chr> <dbl> <dbl> <dbl>
# 1 20 0 1 0.5
# 2 20 0.1 0.95 0.6
# 3 20 0.2 0.89 0.695
# 4 20 0.3 0.820 0.784
#
# [[14]]
# # A tibble: 4 × 4
# id time x y
# <chr> <dbl> <dbl> <dbl>
# 1 21 0 -1 1
# 2 21 0.1 -1.1 0.9
# 3 21 0.2 -1.19 0.79
# 4 21 0.3 -1.27 0.671
#
# [[15]]
# # A tibble: 4 × 4
# id time x y
# <chr> <dbl> <dbl> <dbl>
# 1 22 0 -0.5 1
# 2 22 0.1 -0.6 0.95
# 3 22 0.2 -0.695 0.89
# 4 22 0.3 -0.784 0.820
#
# [[16]]
# # A tibble: 4 × 4
# id time x y
# <chr> <dbl> <dbl> <dbl>
# 1 23 0 0 1
# 2 23 0.1 -0.1 1
# 3 23 0.2 -0.2 0.99
# 4 23 0.3 -0.299 0.97
#
# [[17]]
# # A tibble: 4 × 4
# id time x y
# <chr> <dbl> <dbl> <dbl>
# 1 24 0 0.5 1
# 2 24 0.1 0.4 1.05
# 3 24 0.2 0.295 1.09
# 4 24 0.3 0.186 1.12
#
# [[18]]
# # A tibble: 4 × 4
# id time x y
# <chr> <dbl> <dbl> <dbl>
# 1 25 0 1 1
# 2 25 0.1 0.9 1.1
# 3 25 0.2 0.79 1.19
# 4 25 0.3 0.671 1.27
#
# [[19]]
# # A tibble: 4 × 4
# id time x y
# <chr> <dbl> <dbl> <dbl>
# 1 3 0 0 -1
# 2 3 0.1 0.1 -1
# 3 3 0.2 0.2 -0.99
# 4 3 0.3 0.299 -0.97
#
# [[20]]
# # A tibble: 4 × 4
# id time x y
# <chr> <dbl> <dbl> <dbl>
# 1 4 0 0.5 -1
# 2 4 0.1 0.6 -0.95
# 3 4 0.2 0.695 -0.89
# 4 4 0.3 0.784 -0.820
#
# [[21]]
# # A tibble: 4 × 4
# id time x y
# <chr> <dbl> <dbl> <dbl>
# 1 5 0 1 -1
# 2 5 0.1 1.1 -0.9
# 3 5 0.2 1.19 -0.79
# 4 5 0.3 1.27 -0.671
#
# [[22]]
# # A tibble: 4 × 4
# id time x y
# <chr> <dbl> <dbl> <dbl>
# 1 6 0 -1 -0.5
# 2 6 0.1 -0.95 -0.6
# 3 6 0.2 -0.89 -0.695
# 4 6 0.3 -0.820 -0.784
#
# [[23]]
# # A tibble: 4 × 4
# id time x y
# <chr> <dbl> <dbl> <dbl>
# 1 7 0 -0.5 -0.5
# 2 7 0.1 -0.45 -0.55
# 3 7 0.2 -0.395 -0.595
# 4 7 0.3 -0.336 -0.635
#
# [[24]]
# # A tibble: 4 × 4
# id time x y
# <chr> <dbl> <dbl> <dbl>
# 1 8 0 0 -0.5
# 2 8 0.1 0.05 -0.5
# 3 8 0.2 0.1 -0.495
# 4 8 0.3 0.150 -0.485
#
# [[25]]
# # A tibble: 4 × 4
# id time x y
# <chr> <dbl> <dbl> <dbl>
# 1 9 0 0.5 -0.5
# 2 9 0.1 0.55 -0.45
# 3 9 0.2 0.595 -0.395
# 4 9 0.3 0.635 -0.336There is too much information for just two 2 aesthetics
Many attempts to address this provide understanding in some ways but lose understanding in others
center = TRUERecall the ODE system: \[ \textbf{f}(x,y) = (-y,\,x) \quad \text{(i.e., } \frac{dx}{dt}=-y,\; \frac{dy}{dt}=x\text{)} \]
Euler’s method estimates the solution: \[ \mathbf{v}_{n+1} = \mathbf{v}_n + \Delta t\,\mathbf{f}(\mathbf{v}_n) \]
center = FALSE:
\(\Delta t\): from \(0\) to \(T\)
center = TRUE:
\(+ \Delta t\): from \(0\) to \(T/2\)
\(- \Delta t\): from \(0\) to \(-T/2\)
center = TRUEcenter_backward <- ode(init, times = seq(0, -0.5, by = -0.1), f_ode, parms = NULL, method = "euler")
center_backward <- center_backward[nrow(center_backward):1, ]
center_forward <- ode(init, times = seq(0.1, 0.5, by = 0.1), f_ode, parms = NULL, method = "euler")
(center <- rbind(center_backward, center_forward))
# time x y
# [1,] -0.5 1.39051 0.41049
# [2,] -0.4 1.33610 0.54410
# [3,] -0.3 1.26900 0.67100
# [4,] -0.2 1.19000 0.79000
# [5,] -0.1 1.10000 0.90000
# [6,] 0.0 1.00000 1.00000
# [7,] 0.1 1.00000 1.00000
# [8,] 0.2 0.90000 1.10000
# [9,] 0.3 0.79000 1.19000
# [10,] 0.4 0.67100 1.26900
# [11,] 0.5 0.54410 1.33610center = TRUEcenter = TRUELnormalize = FALSEnormalize = TRUE:
- Default: Streams grow to length L provided by user or determined algorithmically
normalize = FALSE:
- Default: Streams grow to length L, then all are trimmed to the fastest time T
- Customization: Set T to control stream duration
center = FALSEcenter = FALSE: An arrow is drawn from \((x,y)\) to \((x,y) + f(x,y)\)
center = TRUE: an arrow is drawn from \((x,y) - \frac{1}{2} f(x,y)\) to \((x,y) + \frac{1}{2} f(x,y)\)
Lnormalize = TRUE: Vectors are scaled to a fixed length L (either user-specified or computed automatically)
normalize = FALSEnormalize = TRUE: Vectors are scaled to a fixed length L (either user-specified or computed automatically)
normalize = FALSE: Vectors retain their natural magnitude—that is, they start at \((x,y)\) and extend to \((x,y) + f(x,y)\) (with \(\Delta T = 1\))
What if we could do the same thing with vectors?
Consider the scalar function
\[ \phi(x,y) = \frac{1}{2}\left(x^2 + y^2\right) \]
Its gradient is a vector field is given by
\[ \nabla\phi(x,y) = \left(\frac{\partial \phi}{\partial x},\,\frac{\partial \phi}{\partial y}\right) = \left(x,\,y\right) \]
Consider the scalar function
\[ \phi(x,y) = \frac{1}{2}\left(x^2 + y^2\right) \]
Its gradient is a vector field is given by
\[ \nabla\phi(x,y) = \left(\frac{\partial \phi}{\partial x},\,\frac{\partial \phi}{\partial y}\right) = \left(x,\,y\right) \]
Consider the scalar function
\[ \phi(x,y) = \frac{1}{2}\left(x^2 + y^2\right) \]
Its gradient is a vector field is given by
\[ \nabla\phi(x,y) = \left(\frac{\partial \phi}{\partial x},\,\frac{\partial \phi}{\partial y}\right) = \left(x,\,y\right) \]
Express the potential function as a line integral:
\[ \phi(x,y)=\int_{(x_{start},y_{start})}^{(x,y)} \nabla \phi \cdot (dx,dy). \]
Integrate the x-component: \[ \phi(x,y)=\int x\,dx = \frac{1}{2}x^2 + g(y) \]
Differentiate with respect to y: \[ \frac{\partial}{\partial y}\phi(x,y)=\frac{\partial\phi}{\partial y}\left(\frac{1}{2}x^2+g(y)\right) = \frac{\partial\phi}{\partial y}g(y) = \frac{\partial \phi}{\partial y}=g'(y) = y \]
Solve for g(y) by integrating: \[ g(y) = \int g'(y)\,dy = \int y\,dy = \frac{1}{2}y^2 + C \]
So: \[ \phi(x,y) = \frac{1}{2}x^2+g(y) = \frac{1}{2}x^2+\frac{1}{2}y^2+C \]
Consider the scalar function
\[ \phi(x,y) = \frac{1}{2}\left(x^2 + y^2\right) \]
Its gradient is a vector field is given by
\[ \nabla\phi(x,y) = \left(\frac{\partial \phi}{\partial x},\,\frac{\partial \phi}{\partial y}\right) = \left(x,\,y\right) \]
Consider the scalar function
\[ \phi(x,y) = \frac{1}{2}\left(x^2 + y^2\right) \]
Its gradient is a vector field is given by
\[ \nabla\phi(x,y) = \left(\frac{\partial \phi}{\partial x},\,\frac{\partial \phi}{\partial y}\right) = \left(x,\,y\right) \]
Coulomb’s law gives the force between two charges
\[ \textbf{F} = \frac{q_{1} q_{2}}{\left|\left|\textbf{u}_{1}-\textbf{u}_{2}\right|\right|_{}^{3}}(\textbf{u}_{1}-\textbf{u}_{2}) \]
If multiple charges are present, superposition applies
\[ \textbf{F}(\textbf{u}) = \sum_{i=1}^{2} \frac{q q_{i}}{\left|\left|\textbf{u}-\textbf{u}_{i}\right|\right|_{}^{3}}(\textbf{u}-\textbf{u}_{i}) \]
Example:
\(q_{1} = +1\) at \((1,1)\)
\(q_{2} = -1\) at \((-1,-1)\)
Test charge \(q = +1\) at \((x, y)\)
\[ \textbf{f}(x, y) = \frac{1}{((x-1)^2+(y-1)^2)^{3/2}} \left[\begin{matrix}x-1 \\ y-1 \end{matrix}\right] - \frac{1}{((x+1)^2+(y+1)^2)^{3/2}} \left[\begin{matrix}x+1 \\ y+1 \end{matrix}\right] \]
f <- efield_maker(log = TRUE, charge_positions = rbind(c(1,1), c(3,3)))
xlim <- c(0,4); ylim <- c(0,4)
p1 <- ggplot() + geom_vector_field(fun = f, xlim = xlim, ylim = ylim)
p2 <- ggplot() + geom_vector_field(fun = f, xlim = xlim, ylim = ylim, normalize = FALSE, center = FALSE)
(p1 + p2) + plot_layout(guides = "collect")set.seed(123)
n <- 1000
pts <- data.frame(x = runif(n, 0, 4), y = runif(n, 0, 4))
noisy <- t(apply(pts, 1, function(u) f(u) + rnorm(2, 0, 0.05)))
rand_dat <- cbind(pts, setNames(as.data.frame(noisy), c("fx","fy"))) |> as_tibble()
p1 <- ggplot(rand_dat) + geom_vector(aes(x = x, y = y, fx = fx, fy = fy))
p2 <- ggplot(rand_dat) + geom_vector(aes(x = x, y = y, fx = fx, fy = fy), normalize = FALSE)
p1 + p2 + plot_layout(guides = "collect")Each component is estimated using a smooth basis expansion:
\[ \widehat{f}_x(x, y) = \sum_{j=1}^{K_x} \beta_j^{(x)} \, \phi_j^{(x)}(x, y), \quad \widehat{f}_y(x, y) = \sum_{j=1}^{K_y} \beta_j^{(y)} \, \phi_j^{(y)}(x, y) \]
\(\phi^{(x)}(x, y)\) and \(\phi^{(y)}(x, y)\) as the basis function vectors at location \((x, y)\) for each model.
\(\beta^{(x)}\) and \(\beta^{(y)}\) are the coefficient vectors estimated using least squares.
If \(\Sigma_x\) and \(\Sigma_y\) be the covariance matrices of \(\beta^{(x)}\) and \(\beta^{(y)}\). Then
\[ \operatorname{Var}[\widehat{\textbf{f}}(x, y)] = \begin{bmatrix} \phi^{(x)}(x, y)^T \Sigma_x \, \phi^{(x)}(x, y) & 0 \\ 0 & \phi^{(y)}(x, y)^T \Sigma_y \, \phi^{(y)}(x, y) \end{bmatrix} \]
The covariance matrix of a predicted vector is defined as:
\[ \boldsymbol{\Sigma}_{\widehat{\mathbf{v}}} = \mathbf{X}_{\text{new}}\,\mathbf{V}\,\mathbf{X}_{\text{new}}^\top + \boldsymbol{\Sigma}, \quad \mathbf{V} = \sigma^2 (\mathbf{X}^\top \mathbf{X})^{-1}. \]
\(\boldsymbol{\Sigma}_{\widehat{\mathbf{v}}}\) is the covariance matrix of the predicted vector field at a new location.
\(\mathbf{X}_{\text{new}}\) is the design matrix at the new points.
\(\mathbf{V}\) is the covariance matrix of the regression coefficients (\(\mathbf{\beta}\)).
\(\boldsymbol{\Sigma}\) is the observation noise or residual variance.
Eigen-decomposition:
\[ \boldsymbol{\Sigma}_{\widehat{\mathbf{v}}} = \mathbf{Q}\,\boldsymbol{\Lambda}\,\mathbf{Q}^\top, \quad \boldsymbol{\Lambda} = \operatorname{diag}(\lambda_1,\lambda_2). \]
Semi-axis lengths and orientation:
\[ a_i = \sqrt{\lambda_i\,\chi^2_{1-\alpha}}, \quad i=1,2, \qquad \theta = \arctan\left(\frac{q_{2,1}}{q_{1,1}}\right). \]
\(\mathbf{Y} = \mathbf{X}\mathbf{B} + \mathbf{E}\)
where
The least squares estimator is
\(\widehat{\mathbf{B}} = (\mathbf{X}^\top \mathbf{X})^{-1} \mathbf{X}^\top \mathbf{Y}.\)
Cokriging is a technique that extends ordinary kriging to jointly predict multiple, correlated spatial variables. It leverages the spatial cross-correlation between the variables to improve prediction accuracy.
# Linear Model of Coregionalization found. Good.
# [using ordinary cokriging]
# Linear Model of Coregionalization found. Good.
# [using ordinary cokriging]
\[ r = \sqrt{f_x^2 + f_y^2}, \quad \theta = \operatorname{atan2}(f_y, f_x), \] with Jacobian determinant \(|r|\). The joint density becomes:
\[ f_{r,\theta}(r,\theta) = f_{f_x,f_y}(r\cos\theta,\,r\sin\theta)\,|r|. \] Marginalize over r to get angular density:
\[ f_\theta(\theta) = \int_start^\infty f_{r,\theta}(r,\theta)\,dr, \qquad F_\theta(\theta) = \int_{-\pi}^\theta f_\theta(u)\,du. \] Confidence bounds for angular uncertainty:
\[ F_\theta(\theta_{\text{lower}}) = \frac{\alpha}{2}, \quad F_\theta(\theta_{\text{upper}}) = 1 - \frac{\alpha}{2}. \]
ggplot(rand_dat, aes(x = x, y = y, fx = fx, fy = fy)) +
geom_vector_field(fun = f, alpha = .3, color = "black", center = FALSE) +
geom_vector_smooth(pi_type = "wedge", method = "kriging") # Linear Model of Coregionalization found. Good.
# [using ordinary cokriging]
p <- ggplot(rand_dat, aes(x = x, y = y, fx = fx, fy = fy)) +
geom_vector_field(fun = f, alpha = .3, color = "black", center = FALSE)
# geom_vector(alpha = .3, color = "black")
p1 <- p + geom_vector_smooth(eval_points = eval_point, se.circle = TRUE)
p2 <- p + geom_vector_smooth(eval_points = eval_point, se.circle = TRUE, method = "lm")
p3 <- p + geom_vector_smooth(eval_points = eval_point, se.circle = TRUE, method = "kriging")
p1 + p2 + p3 + plot_layout(guides = "collect")# Linear Model of Coregionalization found. Good.
# [using ordinary cokriging]
What if we saw this regression equation as a \(\mathbb{R}^2 \to \mathbb{R}^2\) vector field?
After all, it takes in x/y data and outputs fx/fy data?
scalar_field <- function(u) {
mod <- lm(z ~ x + y + poly(x,2) + poly(y,2), data = sample_data)
pred <- predict(mod, newdata = data.frame(x = u[1], y = u[2]))
as.numeric(pred)
}
# plot the scalar data with the estimated gradient
ggplot(grid) +
ggfunction::geom_function_2d_1d(aes(x = x, y = y, z = z)) + ## sneak peak
geom_gradient_field(fun = scalar_field) +
scale_color_viridis_c()library(gganimate)
p <- ggplot() +
geom_stream_field(fun = f, xlim = c(-3,3), ylim = c(-3,3),
arrow = NULL, center = FALSE, normalize = FALSE,
n = 20,
L = 10,
tail_point = TRUE,
grid = "hex"
) +
theme_void() +
transition_reveal(along = after_stat(t))
animate(p, nframes = 30, fps = 10, duration = 10)ggplot2 Layer Expectationsggplot2 Layer ExpectationsData
geom_point()
geom_line()
geom_segment()
geom_smooth()
geom_stream()
Functions
geom_function()
geom_stream_field()
Install from Github
Load the package
A PDF \(f_X(x)\) satisfies
\(\Pr(a \le X \le b) = \int_a^b f_X(x) \, dx, \quad \int_{-\infty}^{\infty} f_X(x) \, dx = 1\)
For example, the standard normal PDF is:
\(f(x) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right)\)
The CDF \(F_X(x)\) is defined as
\[ F_X(x) = \Pr(X \le x) = \int_{-\infty}^{x} f_X(t) \, dt. \]
It is non-decreasing and maps \(\mathbb{R}\) to \([0,1]\).
The quantile function \(Q(p)\) is given by:
\[ Q(p) = \inf\{ x \in \mathbb{R} : F_X(x) \ge p \}, \quad p \in [0,1]. \]
Parametric curves define a mapping from a scalar parameter \(t\) to 2D coordinates:
\[ x = x(t), \quad y = y(t). \]
For example, the unit circle can be defined as:
\[ x(t) = \sin t,\quad y(t) = \cos t,\quad t \in [0,2\pi]. \]
A scalar field maps 2D input to a scalar output:
\[ f : \mathbb{R}^2 \to \mathbb{R}, \quad f(x,y) = z. \]
The function’s value at each \((x,y)\) can be visualized with color gradients.
Visualize the scalar field \(f(x,y)=\sin(x)\cos(y)\) with:
A wrapper for geom_stream_field()
Completed
geom_stream()
geom_vector()
geom_stream_field() geom_vector_field() geom_gradient_field()
geom_potential()
geom_vector_smooth() geom_stream_smooth() geom_gradient_smooth()
Future
Publish Papers
Publicize Package
geom_complex_function()
geom_stream_field(se = TRUE)
Committee
- Dr David Kahle
- COL (R) Rod Sturdivant, PhD
- Dr Keith Richards
- Dr Joshua Patrick
Peers
West Point
Family
LTC Dusty Turner